TRiM: Tensor Reduction in Memory
Authors
Abstract
Personalized recommendation systems are gaining significant traction due to their industrial importance. An important building block of such systems is the embedding layer, which exhibits highly memory-intensive characteristics. The fundamental primitives of embedding layers are vector gathers followed by reductions, which have low arithmetic intensity and become bottlenecked by memory throughput. To address this issue, recent proposals in this research space employ near-data processing (NDP) solutions at the DRAM rank level, achieving a performance speedup. We observe that prior NDP solutions based on rank-level parallelism leave performance on the table, as they do not fully reap the abundant data transfer throughput inherent in DRAM datapaths. Based on the observation that the DRAM datapath has a hierarchical tree structure, we propose a novel, fine-grained NDP architecture for recommendation systems, which augments the DRAM datapath with an “in-DRAM” reduction unit at the DDR4/5 rank, bank-group, or bank level, achieving improvements over state-of-the-art approaches. We also propose hot embedding-vector replication to alleviate load imbalance across the reduction units.
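The gather-then-reduce primitive the abstract refers to can be illustrated with a minimal NumPy sketch. All names and sizes below are illustrative assumptions, not taken from the paper; the point is that each loaded embedding element contributes only one addition, which is why the operation has low arithmetic intensity and is bound by memory throughput.

```python
import numpy as np

# Toy embedding table: 1000 rows of 64-dim vectors (sizes are illustrative).
table = np.random.default_rng(0).standard_normal((1000, 64)).astype(np.float32)

def embedding_reduce(table, indices):
    """Gather the embedding vectors at `indices`, then reduce them with an
    element-wise sum. Roughly one add per element loaded from memory, so
    arithmetic intensity is very low."""
    return table[indices].sum(axis=0)

# A sparse lookup pools a handful of rows into one 64-dim vector.
pooled = embedding_reduce(table, [3, 17, 256, 999])
```

An NDP design such as the one described above performs this sum inside the memory device instead of shipping every gathered row over the memory bus.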
Similar resources
Ship Trim Optimization for the Reduction of Fuel Consumption
Reducing greenhouse gas emissions and atmospheric pollutants, achieving economic savings, and complying with the shipping industry's new rules are leading concerns, and many efforts have been made to meet these goals. In this paper, some methods for reducing fuel consumption, including optimization of the floating trim at the design draught, are considered. In this regard, the bo...
Tensor sufficient dimension reduction.
A tensor is a multiway array. With the rapid development of science and technology in the past decades, large amounts of tensor observations are now routinely collected, processed, and stored in many scientific and commercial activities. The colorimetric sensor array (CSA) data is such an example. Driven by the need to address data analysis challenges that arise in CSA data, we pro...
The Tensor Memory Hypothesis
We discuss memory models which are based on tensor decompositions using latent representations of entities and events. We show how episodic memory and semantic memory can be realized and discuss how new memory traces can be generated from sensory input: Existing memories are the basis for perception and new memories are generated via perception. We relate our mathematical approach to the hippoc...
Reduction in Cache Memory Power Consumption based on Replacement Quantity
Today, power consumption is considered one of the important issues, so its reduction plays a considerable role in developing systems. Previous studies have shown that approximately 50% of total power consumption is used in cache memories. There is a direct relationship between power consumption and the number of replacements made in the cache. The less the number of replacements is, the less...
Journal
Journal title: IEEE Computer Architecture Letters
Year: 2021
ISSN: 2473-2575, 1556-6056, 1556-6064
DOI: https://doi.org/10.1109/lca.2020.3042805